Markov Chain Monte Carlo Method without Detailed Balance 
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We present a specific algorithm that generally satisfies the balance condition without imposing 
the detailed balance in the Markov chain Monte Carlo. In our algorithm, the average rejection 
rate is minimized, and even reduced to zero in many relevant cases. The absence of the detailed 
balance also introduces a net stochastic flow in a configuration space, which further boosts up the 
convergence. We demonstrate that the autocorrelation time of the Potts model becomes more than 
6 times shorter than that by the conventional Metropolis algorithm. Based on the same concept, a 
bounce-free worm algorithm for generic quantum spin models is formulated as well. 

PACS numbers: 02.70.Tt, 05.10.Ln, 02.50.-r, 02.70.Ss 



The Markov chain Monte Carlo (MCMC) method, 
which is a vital tool for investigating almost all kinds 
of statistical problems, especially systems with multiple 
degrees of freedom, is being applied extensively across 
the various disciplines, such as statistics, economics, and 
bioinformatics, not to mention physics |l|-|3|. In the 
MCMC method for a stationary distribution, the bal- 
ance condition [BC, Eq. (J2J) below] and the ergodicity 
must be imposed; the former ensures the invariance of a 
target distribution, while the latter does the convergence 
to the invariant 0, [B| . Although an MCMC method sat- 
isfying these two conditions guarantees unbiased results 
in infinite time in principle, a rapid relaxation, that is, a 
short correlation time, is essential for the method to work 
in practice. The longer the autocorrelation time is, the 
larger the error bar becomes. In order to achieve weaker 
correlations in a Markov sequence, a number of variants 
have been invented so far, e.g., the Swendsen-Wang al- 
gorithm [6| and the loop algorithm [7| ■ The extended en- 
semble methods, such as the multicanonical method [8] 
and the exchange Monte Carlo method [9] , have also been 
proposed and applied successfully to protein folding prob- 
lems, spin glasses, etc. 

In most practical implementations of the MCMC 
method, the detailed balance condition (DBC), the re- 
versibility, is imposed, where every elementary transition 
is forced to balance with a corresponding inverse pro- 
cess. Thanks to the DBC, it becomes easy to find qual- 
ified transition probabilities. In the meantime, it has 
long been considered difficult to satisfy the BC without 
imposing the DBC, and attempts to reduce autocorre- 
lations in Markov sequences have concentrated on opti- 
mizing transition probabilities within the DBC [id lllj. 
Here, we need to be reminded that the DBC is not a nec- 
essary condition for the invariance. The BC, which is a 
weaker condition than the DBC, is mathematically suf- 
ficient 0, Q. In fact, the DBC is often broken secretly, 
even though the DBC is used apparently to define the 
transition probabilities. The single spin update in a clas- 
sical system is such an example. The random update, 
where a spin to be flipped is chosen uniformly randomly 
among all spins, satisfies the DBC strictly. On the other 



hand, the DBC is broken in the sequential update, where 
spins are swept in a fixed order. The DBC is satisfied 
only locally, that is, only in each spin flip, and the BC is 
eventually fulfilled in one sweep 12 1. 

In this Letter, we present a simple and versatile al- 
gorithm to find a set of transition probabilities in the 
MCMC method, which fulfills the BC but breaks the 
DBC even locally. As the BC is a weaker condition than 
the DBC, the solution space of transition probabilities is 
enlarged, and rejection rates can be reduced as the re- 
sult. We show that by the present algorithm the average 
rejection rate is indeed minimized, and even reduced to 
zero in many relevant cases. Furthermore, breaking the 
DBC introduces a net stochastic flow in the configura- 
tion space. It will boost up the relaxation further by 
suppressing random walk behavior [13l - [l5j [26| . 

In what follows, after describing our specific algo- 
rithm for finding a rejection-minimized solution, we will 
demonstrate its effectiveness in the single spin update of 
the Potts model [16| and the worm (directed-loop) up- 
date [13, [l8[ of the quantum Heisenberg spin chain in 
a magnetic field. In both cases, it is established that 
our algorithm boosts up the relaxation significantly in 
comparison with the conventional algorithms, such as 
the Metropolis (also called Metropolis-Hastings) algo- 
rithm 19, 20], or the heat bath algorithm (Gibbs sam- 
pler) [23, [22I ■ Especially, a bounce- free (rejection- free) 
worm algorithm can be formulated for generic quantum 
spin models, by which the autocorrelation time is often 
reduced by orders of magnitude as we will see below. 

We start with considering an elementary update pro- 
cess, e.g., flipping a single spin in the Ising or Potts mod- 
els, or choosing an exit at an operator in the worm al- 
gorithm (see below). Given an environmental configura- 
tion, we would have n candidates (including the current 
one) for the next configuration. The weight of each candi- 
date configuration (or state) is given by Wi (i = 1, ..., n), 
to which the equilibrium probability distribution is pro- 
portional. Although in the formulation of the MCMC 
method, the BC (or the DBC) is usually expressed in 
terms of the weights {wi} and the transition probabili- 
ties {pi_>j}, it is more convenient to introduce a quan- 
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FIG. 1: Example of weight landfill by the Metropolis and heat 
bath algorithms for n — 2. The amount of landfilled weight, 
Vi-^j, is defined by Eqs. ((3]) and Q, respectively. The regions 
with thick frame denote the rejection rates. 



tity Vi-^j := WiPi^j, which corresponds to the amount 
of (raw) stochastic flow from state i to j. The law of 
probability conservation and the BC are then expressed 

as 
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(2) 



respectively. 

that {vi^j} satisfy 



The average rejection rate is written as 
Wi. Also, it is straightforward to confirm 



'H-tj = -miii(wi,Wj) 

n-1 



i^J 



for the Metropolis algorithm, and 
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for the heat bath algorithm, where the detailed balance, 
i.e., the absence of net stochastic flow, is manifested 
by the symmetry under the interchange of the indices: 

Our task is to find a set {vi->j} that minimizes the 
average rejection rate while satisfying Eqs. ([1]) and ©. 
This procedure can be understood visually as weight 
landfill, where we move (or allocate) some amount of 
weight (wi-s-j) from state i to j keeping the entire shape 
of the weight boxes intact. For catching on this land- 
fill picture, let us think at first the case with n = 2 as 
in the single spin update of the Ising model. Figure Q] 
shows the landfill when the Metropolis and heat bath al- 
gorithms [Eqs. ((3]) and (0J] are applied, where the average 
rejection rate (ex wi-j-i +«2->2) clearly remains finite. In- 
deed, for n = 2 the Metropolis algorithm gives the best 
solution, i.e., the minimum average rejection rate even 
within the BC [see Eq. © below]. 

For n > 3, on the other hand, we can get ahead by 
breaking with the DBC. In Fig. f2J we show examples 
of weight landfill in the case with n = 4 by using the 
Metropolis and the heat bath algorithms together with 
a solution that does not satisfy the DBC. The first two 
again have finite rejection rates, while the last is rejec- 
tion free (i.e., t>j_n = Vi). Although a solution is not 
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FIG. 2: Example of weight landfill by the Metropolis, heat 
bath, and the proposed algorithms for n — 4. In the present 
algorithm, first the maximum weight (tui) is allocated to the 
second box. It saturates the second box, and the remainder is 
all put into the third one (first row). Next, TO2 is allocated to 
the partially filled box and the subsequent box (second row). 
The same procedure is repeated for w^ and IU4. This proposed 
algorithm is rejection free, while there remain finite rejection 
rates in the others as indicated by the thick frames. 



unique obviously, we propose the following procedure as 
a concrete algorithm to find a solution for general n. (i) 
Choose a configuration with maximum weight. If two 
or more configurations have the same maximum weight, 
choose one of them. In the following, we assume wi is 
the maximum without loss of generality. The order of 
the remaining weights does not matter, (ii) Allocate the 
maximum weight w\ to the next box (i — 2). If the 
weight still remains after saturating the box, reallocate 
the remainder to the next (i = 3). Continue until the 
weight is all allocated, (iii) Allocate the weight of the 
first landfilled box (W2) to the last partially filled box 
in step ii). Continue the allocation likewise, iv) Repeat 
step iii) for u>3, W4, ...,w n . Once all the boxes with i > 2 
are saturated, landfill the first box (i = 1) afterward. 

In the above procedure, all the boxes are landfilled 
without any space; that is, it satisfies the BC [Eq. ||2J)]. 
Since the BC is satisfied in each elementary transition, 
it is fulfilled in one sweep as well. It is also clear that 
the second and subsequent boxes must be already satu- 
rated when the allocation of its own weight is initiated, 
since w± is the maximum. By this procedure, {vi^j } are 
determined as 



where 



= max(0, min(A 
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+ Wj-Aij,Wi,Wj)), (5) 
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FIG. 3: Autocorrelation time of the square of order parameter 
near the transition temperature (T ~ 0.910 and 0.745, respec- 
tively) in the 4-state (left) and 8-state (right) Potts models 
by the Metropolis (circles), heat bath (triangles), LOU (dia- 
monds), and present (squares) methods. The system size is 
16 x 16. The error bars are the same order with the point 
sizes. 



Especially, from Eq. (|5|) we obtain 



I max(0, 2wi — S n ) i = 1 
IfJ i>2. 

That is, a rejection- free solution can be obtained, if 



0) 



k=l 



tl'k 



(10) 



is satisfied. On the contrary, when inequality (jT0|) is 
not satisfied, one has to necessarily assign the maximum 
weight to itself, as it is bigger than the sum of the rest. 
Thus, the present solution is optimal within the BC, in 
the sense that it minimizes the average rejection rate. 

We close the introduction of our algorithm with a note 
about the ergodicity. It is far from trivial to prove that 
the present algorithm, as well as the Metropolis algo- 
rithm, satisfies the ergodicity, since many of the transi- 
tion probabilities become zero exactly. In principle, how- 
ever, one can always ensure the ergodicity by combining 
the present algorithm with the heat bath method. An- 
other way to ensure the ergodicity is randomly choosing 
one of the sets of transition probabilities obtained by dif- 
ferent landfill order, though we have not observed any 
glimpse of ergodicity breaking in the following simula- 
tions. 

In order to assess the effectiveness of the present al- 
gorithm, we investigate the autocorrelations in the ferro- 
magnetic g-state Potts models on the square lattice [16| , 
which exhibit a continuous (q < 4) or first-order (q > 4) 
phase transition at T = l/ln(l + ^/q). We calculate 
the autocorrelation time of the square of order param- 
eter for q = 4 and 8 by several algorithms. The auto- 
correlation time Ti nt is estimated through the relation: 
a 1 = (1 + 2Tj nt )(To, where <Tg is the variance of the raw 
time series data and a 1 is the true variance calculated 
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FIG. 4: Extension of the worm-going pathway in the S = 
1/2 model. Here, dashed (solid) vertical lines denote spin 
up (down). The head of worm (open circle) moves on the 
worldline (a), and scatters at the operator (horizontal thick 
line). As candidate configurations, we introduce operator- 
flip updates (d)-(e) in addition to the conventional ones (a)- 
(c). Note that in (e) the position of the operator is shifted 
simultaneously in contrast to the simple bounce process (a). 



from the binned data using a bin size much larger than 
the Tint [l|. In Fig. [31 it is clearly seen that our algo- 
rithm significantly boosts up the relaxation in both mod- 
els in comparison with the conventional methods. In the 
4-state Potts model, the autocorrelation time becomes 
nearly 6.4 times shorter than that by the Metropolis al- 
gorithm, 2.7 times than the heat bath algorithm, and 
even 1.4 times than the locally optimal update (LOU) 



by Pollet et al. jllj, which is considered as one of the 



best solutions within the DBC. Furthermore, the present 
algorithm is increasingly advantaged as q increases. 

Next, we move onto the quantum Monte Carlo meth- 
ods. The worm algorithm for quantum spin and lattice 
boson models is formulated based on either the Euclidean 
pathintcgral or the high-temperature series |17l ll8| . One 
Monte Carlo sweep of the worm algorithm consists of 
the diagonal update, where operators are inserted or re- 
moved without changing the shape of worldlines, and the 
off-diagonal update, where the worldlines (and the type 
of operators) are updated with keeping the position of 
operators unchanged. In the latter process, a pair of cre- 
ation and annihilation operators, which is called a worm, 
is inserted on a worldline (pair creation), and one of them 
(called the head) is moved stochastically until the head 
and the tail destroy each other (pair annihilation). As a 
thorny problem, a bounce process, where the head just 
backtracks and cancels the last update, has been gener- 
ally inevitable within the DBC. Here, as an example, we 
consider the S =1/2 antiferromagnetic XXZ model: 



H - 2_^ (SfSJ 
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(11) 
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where we introduce an arbitrary parameter C controlling 
the ratio between the diagonal and off-diagonal weights. 
In the head scattering process at an operator, only three 
among four exits have a nonzero weight due to the conser- 
vation of the total S z [(a)-(c) in Fig. 0] . At the Heisen- 
berg point (A = 1), there remain finite bounce probabil- 
ities except at ft = within the DBC [H H |23| . Unfor- 
tunately, the situation does not improve much even in the 
BC, because the total number of candidates is too small. 
However, the condition (|10p provides us a clear prospect; 
by increasing the number of candidates, a bounce-free al- 




FIG. 5: Magnetic field dependence of magnetization (upper) 
and autocorrelation time (lower) of the S = 1/2 antiferromag- 
netic Heisenberg chain (L = 64, T — 1/2L). The maximum 
autocorrelation time is 1.0 x 10 3 by the worm update with 
Metropolis (circles), 9.8 x 10 2 by the worm with heat bath 
(triangles), 3.3 x 10 2 by the LOU (diamonds), an improve- 
ment of the directed loop, and 1.7 x 10 2 by the present algo- 
rithm (squares). By the bounce- free worm with the operator 
flip, Ti n t is further reduced down to 8.1 (solid squares). 



gorithm will be realized. According to this strategy, we 
introduce operator-flip updates, where sites on which an 
operator acts are shifted simultaneously (Fig. @|. By the 
operator flip together with the constant C chosen as 



C = max (t(2A + 3h-l), -(A + 3/1+1 



(12) 



we can actually eliminate the bounce process absolutely. 

The autocorrelation data of the magnetization in the 

Heisenberg chain (A = 1) are shown in Fig. [5] Amaz- 



ingly, the bounce-free worm algorithm with the opera- 
tor flip is faster by about 2 orders of magnitude than 
the Metropolis and the heat bath algorithms. Also in 
high-5 spin systems, the bounce-free worms can be con- 
structed by representing the partition function by de- 
composed 5 = 1/2 spins [24l | . Our idea of breaking the 
DBC and operator-flip updates are also applied to gen- 
eral bosonic models effectively, because bosonic worms 
get bounce-minimized with more candidates. 

In the present study, we have developed a simple and 
versatile MCMC algorithm that generally satisfies the 
BC without imposing the DBC. In our algorithm, the 
average rejection rate gets minimized, which reduces the 
autocorrelation time significantly in comparison with the 
conventional methods based on the DBC. We also have 
introduced operator-flip updates in the worm algorithm, 
yielding a bounce-free algorithm in generic spin models. 
The present concept can be naturally extended to sys- 
tems with continuous state variables by replacing the 
weight landfill presented here with an asymmetric ran- 
dom cyclic shifting in a cumulative probability distribu- 
tion function. Our approach beyond the DBC can be 
universally applied to any MCMC methods, even to inge- 
nious established methods, such as the cluster algorithms 
and the extended ensemble methods, and will undoubt- 
edly improve the relaxation in Markov sequences. 

Most simulations were performed on T2K Supercom- 
puter at University of Tsukuba. T he prog ram was devel- 
oped based on the ALPS library [24|, |25| ■ We acknowl- 
edge support by Grant-in- Aid for Scientific Research Pro- 
gram (No. 20540364) from JSPS, and by Grand Chal- 
lenges in Next-Generation Integrated Nanoscience, Next- 
Generation Supercomputer Project from MEXT, Japan. 
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